install: Enable installing to multi device parents #1911
ckyrouac wants to merge 9 commits into bootc-dev:main
Code Review
This pull request successfully enables installing to multi-device parent filesystems, such as LVM spanning multiple disks. It correctly discovers all parent devices and, for bootupd/GRUB, installs the bootloader to all devices with an ESP partition. For bootloaders that only support single-device configurations like systemd-boot and zipl, the implementation correctly defaults to using the first available device. The changes are well-architected, adapting data structures and logic to handle multiple devices. A new, thorough integration test validates both single and dual ESP scenarios. Overall, this is a solid enhancement with good error handling and logging. I have one suggestion to further improve the robustness of ESP detection.
Waiting to merge until the patch release goes out.
# See https://tmt.readthedocs.io/en/stable/stories/features.html#reboot-during-test
match $env.TMT_REBOOT_COUNT? {
    null | "0" => test_single_esp,
    "1" => { test_dual_esp; test_three_devices_partial_esp; tmt-reboot },
How about adding another test scenario, such as RAID1 using whole disks (see coreos/bootupd#1059)? For example:
root@localhost-live:/home/fedora# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINTS
vda 253:0 0 30G 0 disk
└─md126 9:126 0 30G 0 raid1
├─md126p1 259:0 0 477M 0 part
├─md126p2 259:1 0 954M 0 part /boot
└─md126p3 259:2 0 28.6G 0 part /
vdb 253:16 0 30G 0 disk
└─md126 9:126 0 30G 0 raid1
├─md126p1 259:0 0 477M 0 part
├─md126p2 259:1 0 954M 0 part /boot
└─md126p3 259:2 0 28.6G 0 part /
The RAID1 array can be created with:
sudo mdadm -CR /dev/md126 -e 1 -l1 -n 2 /dev/loop0 /dev/loop1 --assume-clean
cgwalters left a comment
Looks sane to me, I have just nits really that are nonblocking, we can fix in followups too.
crates/lib/src/bootloader.rs
Outdated
BwrapCmd::new(&target_root)
    .setenv(
        "PATH",
        "/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin",
Let's use a shared const for this or perhaps better have set_default_path() on BwrapCmd
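A minimal sketch of what that suggestion could look like. The `BwrapCmd` below is a hypothetical stand-in that models only environment handling (the real type takes a target root and wraps a command), and the `DEFAULT_PATH` name and the `env()` accessor are assumptions made for illustration:

```rust
use std::collections::BTreeMap;

/// Search path shared by every sandboxed command (value taken from the diff above).
pub const DEFAULT_PATH: &str = "/bin:/usr/bin:/sbin:/usr/sbin:/usr/local/bin:/usr/local/sbin";

/// Hypothetical stand-in for the real `BwrapCmd`; only env handling is modeled.
#[derive(Debug, Default)]
pub struct BwrapCmd {
    env: BTreeMap<String, String>,
}

impl BwrapCmd {
    pub fn new() -> Self {
        Self::default()
    }

    /// Set one environment variable, builder style.
    pub fn setenv(mut self, key: &str, value: &str) -> Self {
        self.env.insert(key.to_string(), value.to_string());
        self
    }

    /// Apply the shared PATH so call sites no longer repeat the literal.
    pub fn set_default_path(self) -> Self {
        self.setenv("PATH", DEFAULT_PATH)
    }

    /// Look up a variable that was set on this command (illustrative accessor).
    pub fn env(&self, key: &str) -> Option<&str> {
        self.env.get(key).map(String::as_str)
    }
}
```

With this shape each call site shrinks to `BwrapCmd::new().set_default_path()`, and a later change to the path touches a single constant.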
CLOUDEOF
fi

# Temporary: update bootupd from @CoreOS/continuous copr until
This is a bit tricky as we probably also want to CI test the case without the new bootupd too...
I know, another matrix entry in theory...
I guess it'll be tested in post submits by the workflow test.
# Test that bootc install to-existing-root can find and use ESP partitions
# when the root filesystem spans multiple backing devices (e.g., LVM across disks).
#
# Five scenarios are tested across three reboot cycles:
We're just testing installation to loopback, AFAICS there's no reason to reboot the host.
That said what would be a stronger test here is for us to test these setups probably via Anaconda where we boot into a guest VM with that. I tried to streamline this in bootc-dev/bcvk#202 but yeah it needs a bit of work.
I guess one thing we could do is extend the anaconda stuff we already have here in our integration tests.
Yet a different alternative is we ask tmt to directly support kickstart (and copy over some of the bcvk smarts there re mounting host container storage via virtiofs etc).
/// files before reinstallation. On multi-device setups only the first
/// ESP is mounted and cleaned; stale files on additional ESPs are left
/// in place (bootupd will overwrite them during installation).
// TODO: clean all ESPs on multi-device setups
Hmm yeah maybe should move this logic into bootupd
    *) copr_distro="centos-stream" ;;
esac
# Update bootc from rhcontainerbot copr; the new bootupd
# requires a newer bootc than what ships in some base images.
But wait we're building this as part of our CI here right?
enabled=1
enabled_metadata=1
REPOEOF
dnf -y install bootupd-0.2.32.41.gb788553
Isn't this the main thing we need?
find_vmlinuz_initrd_duplicates() previously opened the absolute path /sysroot/state/deploy via STATE_DIR_ABS, which during install belongs to the host, not the target. Fix by passing a Dir opened relative to the target's physical_root.

During fresh install the state dir naturally does not exist yet, so open_dir_optional returns None and the check is skipped -- no special-case guard needed.

Also plumb physical_root through the setup_composefs_bls_boot match arms, and replace the hardcoded "/sysroot" in the Upgrade arm with storage.physical_root_path.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
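The "open relative to the target, treat absence as None" idea can be sketched with the standard library alone. This is not the project's code: the actual change uses a capability-style `Dir` with `open_dir_optional`, whereas the helper below works on plain paths, and the `STATE_DIR_RELATIVE` constant and both function names are assumptions made for illustration:

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// State directory location relative to the physical root (assumed from
/// the "/sysroot/state/deploy" path named in the commit message).
const STATE_DIR_RELATIVE: &str = "state/deploy";

/// Resolve the state directory under `physical_root`.
/// Returns Ok(None) when it does not exist yet, e.g. on a fresh install.
fn open_state_dir_optional(physical_root: &Path) -> io::Result<Option<PathBuf>> {
    let dir = physical_root.join(STATE_DIR_RELATIVE);
    match fs::metadata(&dir) {
        Ok(meta) if meta.is_dir() => Ok(Some(dir)),
        Ok(_) => Err(io::Error::new(
            io::ErrorKind::Other,
            "state path exists but is not a directory",
        )),
        Err(e) if e.kind() == io::ErrorKind::NotFound => Ok(None),
        Err(e) => Err(e),
    }
}

/// Count existing deployments; a missing state dir simply means zero,
/// so callers need no special-case guard for fresh installs.
fn count_deployments(physical_root: &Path) -> io::Result<usize> {
    let Some(dir) = open_state_dir_optional(physical_root)? else {
        return Ok(0);
    };
    Ok(fs::read_dir(dir)?.count())
}
```

Because the root is a parameter, the same function serves the host (`/sysroot`) and an install target mounted elsewhere.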
Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The composefs BLS and UKI boot setup paths called find_partition_of_esp() directly on the device, which fails when the root filesystem is on an LVM logical volume (the ESP is on the parent disk, not the LV). The store module had the same issue via require_single_root() + find_partition_of_esp().

Switch all call sites to find_colocated_esps() which walks up to the physical disk(s) via find_all_roots() before searching for the ESP, consistent with what install_systemd_boot and mount_esp_part already do.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
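To illustrate the walk described in this commit, here is a rough sketch over a simplified device tree. The `Device` struct and its fields are hypothetical and far simpler than the project's real block-device model; only the method names `find_all_roots` and `find_colocated_esps` come from the commit message, and their signatures here are assumptions:

```rust
/// Hypothetical, simplified block device node.
#[derive(Debug, Clone)]
pub struct Device {
    pub name: String,
    pub is_esp: bool,
    /// Devices this one is built on (e.g. the disks backing an LV).
    pub parents: Vec<Device>,
    /// Partitions found on this device.
    pub children: Vec<Device>,
}

impl Device {
    /// Walk upward until reaching devices with no parents (physical disks).
    pub fn find_all_roots(&self) -> Vec<&Device> {
        if self.parents.is_empty() {
            return vec![self];
        }
        self.parents
            .iter()
            .flat_map(|parent| parent.find_all_roots())
            .collect()
    }

    /// Collect the first ESP partition on each physical root, skipping
    /// roots that have none.
    pub fn find_colocated_esps(&self) -> Vec<&Device> {
        self.find_all_roots()
            .into_iter()
            .filter_map(|root| root.children.iter().find(|child| child.is_esp))
            .collect()
    }
}
```

Searching the logical volume directly would find nothing, since an LV has no partitions of its own; starting from the roots is what lets the ESP lookup succeed.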
The test was using `get_target_image` which returns the upstream `docker://quay.io/centos-bootc/centos-bootc:stream10`. On composefs+grub variants provisioned with an updated bootupd from copr, the upstream image has stock bootupd with incompatible EFI update metadata, causing the install to fail with "Failed to find EFI update metadata".

Switch to using `containers-storage:localhost/bootc` (the locally-built image), matching the pattern used by test-32, test-37, and test-38. The locally-built image has the updated bootupd with compatible metadata.

Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The initial change to use locally-built images had two additional issues
on composefs:
1. containers-storage: transport fails on composefs's read-only root
with "mkdir /.local: read-only file system". Fix by exporting the
image to an OCI layout directory on writable /var/tmp instead.
2. run_install() was masking /sysroot/ostree and removing bootupd update
metadata, which composefs needs for bootloader installation and boot
binaries. Fix by making run_install() skip these ostree-specific
workarounds on composefs systems.
Note: the composefs install-outside-container code path still has a
separate bug ("Shared boot binaries not found" in boot.rs:745) that
needs fixing in the Rust code.
Assisted-by: Claude Code (Opus 4)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
The multi-device ESP test creates ESP partitions and expects bootupd to install a UEFI bootloader. On BIOS-booted systems, bootupd instead tries to install GRUB for i386-pc, which requires a BIOS Boot Partition and fails.

The test plan already requests UEFI provisioning via the hardware hint, but Testing Farm does not always honor this on CentOS Stream x86_64. Add a runtime check for /sys/firmware/efi so the test skips gracefully on BIOS hosts rather than failing.

Assisted-by: Claude Code (Opus 4.6)
Signed-off-by: ckyrouac <ckyrouac@redhat.com>
Signed-off-by: Colin Walters <walters@verbum.org>
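The check itself is small. The test in this PR is a shell-style script, so the Rust below is only an illustration of the same idea; the function names are hypothetical, and the sysfs root is a parameter purely so the logic can be exercised without real firmware:

```rust
use std::path::Path;

/// True when the kernel exposes an EFI firmware directory under the given
/// sysfs root, which on Linux indicates the system was booted via UEFI.
fn booted_with_uefi(sysfs_root: &Path) -> bool {
    sysfs_root.join("firmware/efi").is_dir()
}

/// Decide whether the ESP test should run on this host.
/// Returning false means "skip", not "fail".
fn should_run_esp_test() -> bool {
    booted_with_uefi(Path::new("/sys"))
}
```

A caller would log a skip message and exit successfully when `should_run_esp_test()` returns false, which keeps BIOS-provisioned runners from reporting a spurious failure.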
Extract the repeated PATH environment variable string into a set_default_path() method on BwrapCmd. The bwrap environment may not have a complete PATH, causing tools like bootupctl or sfdisk to not be found. This consolidates the workaround into one place.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
Several improvements to ESP partition discovery:

Add find_partition_of_esp_optional() returning Result<Option<&Device>> to cleanly separate three outcomes: found, absent, and genuinely unexpected errors (like unsupported partition table types). The existing find_partition_of_esp() is now a thin wrapper that converts None to Err.

Add find_first_colocated_esp() helper to replace a 10-line pattern that was repeated verbatim 5 times across boot.rs and store/mod.rs.

Deduplicate roots in find_all_roots() using a seen-set: in complex topologies like multipath, multiple parent branches can converge on the same physical disk.

find_colocated_esps() now uses the optional variant to properly propagate real errors while treating absence normally.

Also extract the match-on-if-else in setup_composefs_bls_boot into a let binding for readability.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
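The three-outcome lookup and the seen-set deduplication described in this commit can be sketched as follows. The `Disk`, `Partition`, and `TableType` types are hypothetical, errors are plain strings for brevity, the set of accepted table types is an assumption, and `dedup_roots` is an illustrative free function rather than the project's `find_all_roots()`:

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq)]
pub enum TableType {
    Gpt,
    Dos,
    Other(String),
}

#[derive(Debug)]
pub struct Partition {
    pub name: String,
    pub is_esp: bool,
}

#[derive(Debug)]
pub struct Disk {
    pub name: String,
    pub table: TableType,
    pub partitions: Vec<Partition>,
}

/// Ok(Some) = found, Ok(None) = no ESP on this disk, Err = unexpected problem.
pub fn find_partition_of_esp_optional(disk: &Disk) -> Result<Option<&Partition>, String> {
    match &disk.table {
        TableType::Gpt | TableType::Dos => Ok(disk.partitions.iter().find(|p| p.is_esp)),
        TableType::Other(t) => Err(format!("unsupported partition table type: {}", t)),
    }
}

/// Thin wrapper for callers that require an ESP: absence becomes an error.
pub fn find_partition_of_esp(disk: &Disk) -> Result<&Partition, String> {
    find_partition_of_esp_optional(disk)?
        .ok_or_else(|| format!("no ESP found on {}", disk.name))
}

/// Drop repeated roots while keeping first-seen order, for topologies where
/// several parent branches converge on the same physical disk.
pub fn dedup_roots<'a>(roots: Vec<&'a Disk>) -> Vec<&'a Disk> {
    let mut seen = HashSet::new();
    roots
        .into_iter()
        .filter(|d| seen.insert(d.name.as_str()))
        .collect()
}
```

Keeping `None` separate from `Err` is what lets a multi-disk scan skip disks without an ESP while still surfacing a real failure from any one of them.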
The no-ESP test only checked for a non-zero exit code, which would also pass if podman itself failed for unrelated reasons. Check that the output contains "ESP" to confirm the right failure mode.

Assisted-by: OpenCode (Claude Opus 4)
Signed-off-by: Colin Walters <walters@verbum.org>
OK, I pushed a few cleanups here. When you get back tomorrow, can you take a look, @ckyrouac?
When the root filesystem spans multiple backing devices (e.g., LVM across multiple disks), discover all parent devices and find ESP partitions on each. For bootupd/GRUB, install the bootloader to all devices with an ESP partition, enabling boot from any disk in a multi-disk setup. systemd-boot and zipl only support single-device configurations.
This adds a new integration test validating both single-ESP and dual-ESP multi-device scenarios.
Fixes: #481
Assisted-by: Claude Code (Opus 4.5)
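As a rough illustration of the per-bootloader behaviour in the description above: GRUB via bootupd is installed to every candidate device, while systemd-boot and zipl use only the first. The `Bootloader` enum and `install_targets` function are hypothetical names and do not reflect the project's actual types:

```rust
/// Hypothetical bootloader selector.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Bootloader {
    Grub,
    SystemdBoot,
    Zipl,
}

/// Pick the devices a bootloader should be installed to.
/// `devices` is the ordered list of candidate parent devices.
pub fn install_targets<'a>(bootloader: Bootloader, devices: &[&'a str]) -> Vec<&'a str> {
    match bootloader {
        // bootupd/GRUB: install everywhere so any disk can boot the system.
        Bootloader::Grub => devices.to_vec(),
        // Single-device bootloaders: fall back to the first candidate.
        Bootloader::SystemdBoot | Bootloader::Zipl => {
            devices.iter().take(1).copied().collect()
        }
    }
}
```

An empty result would signal that no suitable device was found, which the caller would presumably report as an installation error.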